Structure and evolution of online social relationships: Heterogeneity in warm discussions

With the advance of the information age, people use electronic media for
communication more frequently, and social relationships increasingly rely on
online channels. While traditional social networks have been studied
extensively, little has been done on online social networks. Here we analyze
the structure and evolution of online social relationships by examining the
temporal records of a bulletin board system (BBS) in a university. The BBS
dataset comprises 1,908 boards, in which a total of 7,446 students participate.
An edge is assigned to each dialogue between two students, where a dialogue is
defined by the appearance of the students' names in the from- and to-fields of
a message. This yields a weighted network between the communicating students
with an unambiguous group association of individuals. In contrast to a typical
community network, where intracommunity (intercommunity) ties are strong
(weak), the BBS network contains hub members who participate in many boards
simultaneously and are strongly tied; that is, they have a large degree and
betweenness centrality and provide communication channels between communities.
On the other hand, intracommunity connections are rather homogeneous and weak.
Such a structure, which has never been empirically characterized in the past,
might provide a new perspective on social opinion formation in this digital era.




I. INTRODUCTION 



With the advancement in the information age, people 
are using electronic media for communication more fre- 
quently, and social relationships between people are also 
increasingly resorting to online communications. For ex- 
ample, the advent of online bulletin board systems (BBS) 
made it possible to develop a new type of online social 
relationship and social consensus. Very similar to the 
Usenet service, which was fairly popular during the ear- 
lier days of the Internet, BBS is based on the commu- 
nication between people sharing common interests; the 
topic of interest is usually identified by the board itself. 
People with common interests post messages on a cer- 
tain board and a response is conveyed by posting an- 
other message, thereby forming a thread. Thus, a thread 
in the BBS roughly represents a dialogue between peo- 
ple, and such a dialogue constitutes the basic relationship 
among the people participating in it. In the BBS, dialogues or
discussions usually proceed with little restriction on message
writing and little discrimination based on personal information,
thereby forming the so-called "warm discussions" described in
psycho-sociology. Therefore, the pattern of such online social
relationships may be different from that of traditional social
relationships based on face-to-face contact, or of online
communication involving the exchange of personal information,
such as e-mail transactions and instant messaging.
Thus, it would be interesting to study the structure of 
online social relationship networks constructed by people 
in warm discussions; this would be useful in resolving diverse
sociological and political issues and in understanding the manner
in which social opinion is formed in the digital era. Extensive
studies on traditional social networks have been carried out;
however, few studies exist on online social networks. Here,
we investigate the structure of online social networks by 
studying BBS networks, which are familiar to university 
students. 



From the graph-theoretical perspective, the BBS network offers
distinct features such as a weighted and modular network structure.
Since the number of times a given pair of people exchange dialogues
can be counted explicitly, a weighted network is naturally obtained.
Moreover, since people share a board corresponding to their common
interests, the BBS provides an unambiguous way of defining modules
or communities. This is unlike other examples of accessible
protocols, including the sibling/peer relationship in the online
community [18] and trackback in the blog system. In fact, the BBS
network constructed by us differs in crucial aspects from other
affiliation networks such as the collaboration network [20] and the
student course registration network [21]. In these examples, the
relationship between people is not explicitly defined but is
indicated indirectly by their affiliation. Such an indirect
definition generates several cliques (completely connected
subgroups), which may result in artifacts, particularly in the case
of large affiliations. Thus, obtaining a network of people with
explicit pairwise interaction strengths together with a distinct
community definition is crucial for an appropriate description of
the social system. The BBS network provides such ingredients.

FIG. 1: Schematic network snapshots of the BBS network (a)
and traditional social network (b). 



II. CONCLUSIONS AND DISCUSSION 

The BBS network has interesting structural features and
implications. It contains hub members who participate in dialogues
across a large number of boards, thereby connecting one group of
people at one board to another group at a different board. Further,
their degrees, which are the numbers of people they have exchanged
dialogues with, are large, so that they influence other people
throughout different communities. As a result, the hub members play
the bridging role usually ascribed to weak ties in connecting
different communities; however, their links are strong in terms of
actual activity. On the other hand, intraboard connections are
rather homogeneous in degree. Such a network feature is in contrast
to traditional social networks, in which the ties bridging disparate
communities tend to be weak. The difference is schematically
depicted in Fig. 1. In the BBS network, the strength $s$, i.e., the
total number of dialogues each individual participates in, has a
nonlinear relationship with the degree $k$ as $s \sim k^{1.4}$. This
implies that the hub members tend to post messages considerably more
frequently than people with small degrees. The neutrality in the
assortative mixing is another feature of the BBS network, in
contrast to the assortativity of traditional social networks. Such
behavior may originate from the absence of personal information on
the partner during online social communication. Thus, hub members
are democratic in their connections to the remaining people, and
they are indeed "ubiquitous persons." Since the hub members play a
dominant role in providing communication channels across different
boards, it might be more efficient to use BBS-like online media for
persuading people and drawing social consensus than traditional
social networks based on person-to-person relationships. We attempt
to understand the BBS network from the perspective of a simple
network model. In the model, we take into account the empirical fact
that the BBS network contains groups whose sizes are inhomogeneous.
In addition, the link density of each group is not uniform but
decreases with increasing group size, a feature that has usually
been neglected in constructing models.

It would be interesting to interpret the present work in the
context of a previous psycho-sociological experiment on group
discussions and the resulting consensus, in which group discussions
are distinguished into two types, "warm" and "cold". In the former
type, people express their thoughts freely without any restriction,
while in the latter, group discussions are restricted by some
constraint, either explicit or implicit, for example, a hierarchy
among group members. The experimental study concludes that the
consensus measured after group discussions can differ from that
before the discussions, depending on the type. In the former, the
consensus after discussions shifts to an extreme opinion, while in
the latter, it leads to a trade-off average group consensus. From
the perspective of the experiment, we might state that the dialogues
in the BBS are warm because no restriction is imposed on posting
messages and little information on the personal background of the
partner is provided. Thus, the dialogues in the BBS may lead to
radicalized consensus, violent group behaviors, or imaginative and
creative solutions to a given issue. Since students who are still in
the process of developing a value system are vulnerable to negative
influences, and have more opportunities to be influenced by their
peers through online networks in this digital era than in the past,
the network pattern we report here may be useful in guiding them in
the right direction. Moreover, the BBS network data will be helpful
in understanding the manner in which diverse opinions are
synchronized from the psycho-sociological perspective.



III. BBS NETWORK 

We mainly examined the BBS system at the Korea Advanced Institute
of Science and Technology, named loco.kaist.ac.kr. The
characteristics of the network structure obtained from this BBS
system also appear in another system, bar.kaist.ac.kr. The data
comprise records of all the threads posted from March 9, 2000 to
November 2, 2004, corresponding to a duration of around four and a
half years. As of November 2004, the system comprised 1,908 boards
with a total of 7,446 participating students. In order to ensure
privacy, we were only allowed to access, for each message, the
"from" and "to" fields, the date of posting, and the name of the
board it was posted on. Based on this information, we constructed
the network between students such that for each message, an edge was
assigned between the two students appearing as "from" and "to."
Alternatively, an arc (a directed edge) can be assigned for each
message; however, we found that the communications are largely
reciprocal: approximately half of the postings are accompanied by
another one with its from and to fields reversed, for example, a
"Re:" message. Hereafter, we consider the network as undirected for
simplicity.

Our network construction naturally yields a weighted network in
which the weight $w_{ij}$ of the edge between two students $i$ and
$j$ is the number of messages they exchanged during the period. The
detailed statistics of the BBS are listed in Table I.
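
The construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: each message is taken to be a (sender, receiver, board) tuple, and the function name `build_network` and the sample names are hypothetical, not the authors' code or data. Self-dialogues are skipped here, which corresponds to the "non-self" counts of Table I.

```python
from collections import defaultdict


def build_network(messages):
    """Return (weights, degree, strength) for an undirected weighted network.

    messages: iterable of (sender, receiver, board) tuples (assumed layout).
    weights maps frozenset({i, j}) to w_ij, the number of messages exchanged.
    """
    weights = defaultdict(int)
    for sender, receiver, _board in messages:
        if sender == receiver:
            continue  # ignore self-dialogues
        weights[frozenset((sender, receiver))] += 1

    degree = defaultdict(int)    # k_i: number of distinct partners
    strength = defaultdict(int)  # s_i: sum of w_ij over partners
    for pair, w in weights.items():
        for node in pair:
            degree[node] += 1
            strength[node] += w
    return dict(weights), dict(degree), dict(strength)


# Made-up toy data, for illustration only.
messages = [
    ("ann", "bob", "music"),
    ("bob", "ann", "music"),   # a reply adds to the same undirected edge
    ("ann", "cat", "sports"),
    ("cat", "cat", "sports"),  # self-dialogue, skipped
]
weights, degree, strength = build_network(messages)
```

Treating a message and its reply as contributions to one undirected edge follows the simplification adopted in the text; a directed variant would key the weights on ordered pairs instead.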


IV. STRUCTURE OF THE BBS NETWORK

A. Student network 

The global snapshot of the student network in Fig. 1 reveals the
inhomogeneity among the students. The degree $k_i$ of a student $i$,
which is the number of students he/she has exchanged dialogues with,
is distributed according to a power law with an exponent of around
$-1$ followed by an exponential cutoff, as shown in Fig. 2(a). This
feature is similar to that of the scientific collaboration network
[20]. The strength $s_i$ of a student $i$ is the sum of the weights
of the edges attached to $i$. Therefore, $s_i = \sum_j a_{ij} w_{ij}$,
where $a_{ij}$ is the component of the adjacency matrix; its value is
1 if an edge connects vertices $i$ and $j$ and 0 otherwise. $w_{ij}$
is the weight of the edge between $i$ and $j$. The strength and
degree of a student exhibit a scaling behavior $s(k) \sim k^{\beta}$
with $\beta \approx 1.4$; however, the fluctuation is quite strong,
particularly for small $k$ [Fig. 2(b)]. The strength distribution
behaves similarly to the degree distribution; however, the value of
the cutoff is larger [Fig. 2(a)]. The nonlinear relationship between
$s$ and $k$ implies that the hub members tend to post messages
considerably more frequently than the other people, as is evident in
Table II.
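
One simple way to estimate a scaling exponent such as the one in $s(k) \sim k^{\beta}$ is a least-squares fit of $\log s$ against $\log k$. The sketch below assumes this procedure; the text does not state how the exponent was actually obtained, and the data in the example are synthetic, not the BBS data.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x); inputs must be positive."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx


# Synthetic check: strengths generated exactly as s = 2 * k^1.4.
ks = [1, 2, 4, 8, 16, 32]
ss = [2 * k ** 1.4 for k in ks]
beta = loglog_slope(ks, ss)
```

With real data the strong fluctuations at small $k$ noted in the text would make the estimate sensitive to the fitting range, so binning or restricting the range may be needed.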

Other standard measures of network topology are also obtained. The
local clustering coefficient $c_i$ is the local density of
transitive relationships, defined as the number of triangles formed
by the neighbors of vertex $i$, cornered by $i$ itself, divided by
the maximum possible number of these, $k_i(k_i-1)/2$. The average of
$c_i$ over vertices with a given degree $k$ is referred to as the
clustering function $C(k)$. For the student network, $C(k)$ decays as
$\sim k^{-0.5}$ for large $k$, and its weighted version defined in
Ref. [11] (see footnote 1) behaves as $C^{(w)}(k) \sim k^{-0.3}$, as
shown in Fig. 2(c). The clustering coefficient $C$, which is the
average of $c_i$ over all vertices with $k > 1$, is $\approx 0.48$.
This is one order of magnitude greater than
$C_{\rm random} \approx 0.04$ of its typical randomized counterpart
with an identical degree sequence [22]. The average nearest-neighbor
degree function $k_{nn}(k)$, defined as the average degree of the
neighbors of vertices of degree $k$, is almost flat for the student
network; nevertheless, its weighted version defined in [11] shows a
slightly upward curvature for large $k$ [Fig. 2(d)]. The
assortativity coefficient [23] for the binary network and the
Spearman rank correlation of the degrees are measured to be close to
zero, as $r \approx 0.011$ and $r_{\rm Spearman} \approx 0.024$,
respectively. This almost neutral mixing, which is in contrast to
the common belief that social networks are assortative, has also
been observed in another online social network [18].

1 In Ref. [11], the local weighted clustering coefficient was defined
as $c_i^{(w)} = \sum_{j,h} (w_{ij} + w_{ih}) a_{ij} a_{ih} a_{jh} / [2 s_i (k_i - 1)]$.
$C^{(w)}(k)$ is the average of $c_i^{(w)}$ over vertices with degree
$k$. The weighted average nearest-neighbor degree of vertex $i$ was
defined as $k_{nn,i}^{(w)} = \sum_{j=1}^{N} a_{ij} w_{ij} k_j / s_i$.
$k_{nn}^{(w)}(k)$ is the average of $k_{nn,i}^{(w)}$ over the
vertices with degree $k$.

FIG. 2: Structure of the BBS network. (a) The degree distribution
$P_d(k)$ (o) and the strength distribution $P_s(s)$ (o) of the
entire network. The straight line is a guideline with a slope of
$-1$. (b) The degree-strength scaling relation $s(k)$. The straight
line is a guideline with a slope of 1.4. (c) The clustering function
$C(k)$ (o) and its weighted version (o). The straight lines are
guidelines with slopes of $-0.5$ (lower) and $-0.3$ (upper),
respectively. (d) The average nearest-neighbor degree function
$k_{nn}(k)$ and its weighted version (o). (e) The correlation
between the degree and the membership number $B$. The dotted line is
a guideline with a slope of 1. (f) The membership number
distribution of the vertices $P_b(B)$, where $B$ is the number of
boards that a student participates in. The straight line is a
guideline with a slope of $-1$.
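
The local clustering coefficient and its weighted counterpart from footnote 1 can be computed directly from an adjacency structure. The following is a minimal sketch, assuming the network is stored as a symmetric dict of dicts (node to neighbor to weight); the function name and the toy graph are illustrative only.

```python
def clustering(adj):
    """Return (c, cw): unweighted and weighted local clustering coefficients.

    adj: dict node -> dict neighbor -> weight, assumed symmetric.
    Vertices with degree k <= 1 are skipped, since the coefficient is
    undefined for them.
    """
    c, cw = {}, {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        s = sum(nbrs.values())
        closed = 0      # ordered neighbor pairs (j, h) that are linked
        weighted = 0.0  # sum of (w_ij + w_ih) over those ordered pairs
        for j in nbrs:
            for h in nbrs:
                if j != h and h in adj[j]:
                    closed += 1
                    weighted += nbrs[j] + nbrs[h]
        c[i] = closed / (k * (k - 1))         # equals triangles / [k(k-1)/2]
        cw[i] = weighted / (2 * s * (k - 1))  # formula of footnote 1
    return c, cw


# Toy graph: triangle a-b-c plus a pendant vertex d attached to a.
adj = {
    "a": {"b": 3, "c": 1, "d": 1},
    "b": {"a": 3, "c": 1},
    "c": {"a": 1, "b": 1},
    "d": {"a": 1},
}
c, cw = clustering(adj)
```

In this toy graph the weighted coefficient of vertex a exceeds the unweighted one because the heavy edge a-b lies on the triangle, which is the kind of difference the weighted measure is meant to capture.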

The number of boards that a student participates in 
is likely to be larger for students with a larger degree, as 
shown in Fig. 2(e). Its distribution follows a skewed func- 
tional form in Fig. 2(f). These results imply an important 
fact that a group of people with a large degree tend to 
participate in diverse dialogues on different boards and 
will play a dominant role in drawing social consensus on 
diverse issues. Moreover, they work as mediators between 
different groups in an online social community. 

The betweenness centrality (BC), or load, which is defined as the
effective number of paths or packets passing through a given vertex
when every pair of vertices gives and receives information, is also
measured. The BC distribution follows a power law with an exponent
$\approx 2.2$, as shown in Fig. 3(a), and the BC $\ell$ of a given
vertex is strongly correlated with its degree $k$ as
$\ell \sim k^{1.6}$, as shown in Fig. 3(b). This implies that the hub
members have a large BC and have a strong influence on the remaining
people.

FIG. 3: (a) The betweenness centrality (BC) distribution of the BBS
network. The dotted line is a guideline with a slope of $-2.2$. (b)
The relation between BC ($\ell$) and degree ($k$) of the BBS network.
The dotted line is a guideline with a slope of 1.6.
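
For readers who wish to reproduce a betweenness measurement, the sketch below implements Brandes' algorithm for unweighted, undirected graphs. This is an assumption about a suitable method, not necessarily the procedure used for the BBS data; in particular, the "load" convention may count path endpoints, whereas this version does not.

```python
from collections import deque


def betweenness(adj):
    """Shortest-path betweenness via Brandes' algorithm.

    adj: dict node -> iterable of neighbors (unweighted, undirected).
    Endpoints are not counted; each unordered pair contributes once.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # vertices in order of BFS discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each vertex
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both ends, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}


# Toy star graph: all shortest paths between leaves pass through the hub.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
bc = betweenness(star)
```

The star example mirrors, in miniature, the situation described in the text: a vertex that connects otherwise separate parts of the network carries all the traffic between them.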

In summary, the student network is extremely heterogeneous,
highly clustered, and yet almost neutrally mixed, and it exhibits
a strong nonlinear relationship between the strength and degree.
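
The degree of mixing mentioned above is commonly quantified by the assortativity coefficient, i.e., the Pearson correlation between the degrees found at the two ends of an edge. A minimal sketch, assuming the network is given as a list of undirected edges (the function name is illustrative):

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of an edge.

    edges: list of (u, v) pairs of an undirected simple graph.
    Undefined (division by zero) if all vertices have the same degree.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Use each edge in both directions so that the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean by symmetry
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


# A star is maximally disassortative: the hub links only to leaves.
r_star = degree_assortativity([("hub", "a"), ("hub", "b"), ("hub", "c")])
# A four-vertex path is mildly disassortative.
r_path = degree_assortativity([(0, 1), (1, 2), (2, 3)])
```

Values near zero, as reported for the student network, indicate that high-degree vertices connect to partners of all degrees without preference.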



B. Board network 

The procedure for constructing the board network is 
similar to the usual projection method of the bipartite af- 
filiation network. We create a link between two boards if 
they share at least one common member. In other words, 
each student participating in more than one board con- 
tributes a complete subgraph — a clique — to the board 
network. Thus, the board network is the superposition 
of cliques, each of which originates from the crossboard 
activities of a student. Such crossboard activities will 
provide channels for information transmission across the 
boards. In order to assign meaningful weights to these 
channels, all the links in each clique are assigned a weight 
that is equal to the inverse of the number of vertices in 
that clique. In other words, the communication chan- 
nels created by the students posting on fewer boards are 
stronger. Therefore, the weight of an edge between two 
boards increases with the number of co-members; how- 
ever, the contributions of "ubiquitous persons" would 
only be moderate. The strength of a board is the sum 
of the weights of its edges. Such a strength distribution 
along with the degree distribution, which does not ac- 
count for the weight, is shown in Fig. 4(a). The relation 
between the strength and degree is shown in Fig. 4(b). 
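
The weighted projection described above can be sketched as follows. The input layout (a mapping from each student to the set of boards he or she posts on) and all names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def board_network(memberships):
    """Project student-board memberships onto a weighted board network.

    memberships: dict student -> set of boards the student posts on.
    A student active on b >= 2 boards adds 1/b to every pair of those boards.
    Returns (weights, strength), with weights keyed by sorted board pairs.
    """
    weights = defaultdict(float)
    for boards in memberships.values():
        b = len(boards)
        if b < 2:
            continue  # a single-board student creates no cross-board channel
        for u, v in combinations(sorted(boards), 2):
            weights[(u, v)] += 1.0 / b
    strength = defaultdict(float)
    for (u, v), w in weights.items():
        strength[u] += w
        strength[v] += w
    return dict(weights), dict(strength)


# Made-up toy data, for illustration only.
memberships = {
    "ann": {"music", "sports"},
    "bob": {"music", "sports", "movies"},
    "cat": {"music"},
}
weights, strength = board_network(memberships)
```

Here ann, who posts on two boards, contributes 1/2 to the music-sports link, while bob, who posts on three, contributes only 1/3 to each of his three links, so the contribution of widely active students to any single channel stays moderate.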

The board network is quite highly clustered, with a clustering
coefficient of $\approx 0.61$, and the clustering function decreases
with $k$ [Fig. 4(c)]. However, it is noteworthy that such high
clustering may result from the generation of cliques by the
projection procedure. Moreover, even the randomized board network
has a clustering coefficient as high as $\approx 0.48$. The average
nearest-neighbor degree initially increases with $k$ but decreases
for larger $k$. However, its weighted version increases
monotonically with $k$, as shown in Fig. 4(d).



TABLE I: Statistics of the BBS network as of November 2004. The
numbers in parentheses are the statistics for non-self dialogues.

Number of students $N$                            7446 (7421)
Number of links $L$                               103498 (103473)
Number of dialogues $W$                           1299397 (1267292)
Number of boards $G$                              1908 (1872)
Size of the largest cluster $N_1$                 7350
Average size of the boards $S$                    32.0 (32.6)
Average board memberships of a student $B$        8.2
Average path length $D$                           3.3
Mean degree $\langle k \rangle$                   27.8 (27.9)



TABLE II: The fraction of the dialogues contributed by hub members
with a degree larger than 80 in the ten longest threads. The degree
value of 80 is chosen approximately from Fig. 2(a); beyond this
degree, the power law for the degree distribution fails.

Rank   Thread length   Number of dialogues contributed by hub members   Fraction (%)
 1     229             181                                               79
 2     121              70                                               58
 3      92              92                                              100
 4      74              45                                               61
 5      67              16                                               24
 6      66              45                                               68
 7      65              27                                               41
 8      64              34                                               53
 9      54              54                                              100
10      50              50                                              100




FIG. 4: Structure of the board network. (a) The degree distribution
$P_d(k)$ (o) and strength distribution $P_s(s)$ (o) of the board
network. (b) The degree-strength relation in the board network. The
straight line is a guideline with a slope of 1. (c) The clustering
function $C(k)$ (o) and its weighted version (o). (d) The average
nearest-neighbor degree function $k_{nn}(k)$ (o) and its weighted
version (o).

FIG. 5: Properties of the board subnetworks. (a) The degree
distributions of subnetworks within the five largest boards. Symbols
used are (o), (△), (o), (□), and (▽) in decreasing order of board
size. The fitted curves with the Gamma distribution
$k^{a-1} e^{-k/b} / [\Gamma(a) b^{a}]$ are shown. (b) The degree
distributions of subnetworks within the five largest boards, with
the degree redefined as discussed in the text. (c) The size
distribution of the boards $P_m(M)$. The straight line is a
guideline with a slope of $-0.7$. (d) The link density $A(M)$ within
a board as a function of its size $M$. The straight line is a
guideline with a slope of $-0.65$.



FIG. 6: Evolution of the BBS network. (a) The temporal evolution of
the number of students $N$ (solid), number of links $L$ (dashed),
total number of dialogues $W$ (dotted), and number of boards $G$
(dot-dashed). (b) The same plot as (a) on a double logarithmic
scale. (c) The evolution of the degree distribution $P_d(k)$ of the
student network. The degree distribution for each year is shown. The
symbols (o), (o), (△), and (▽) correspond to each year from 2001 to
2004, respectively, and (□) represents the final configuration. (d)
The clustering function $C(k)$ for each year. The same symbols as
those in (c) are used.



V. STUDENT NETWORK WITHIN A BOARD 

Upon examining the networks within a board, we find a different
picture. As shown in Fig. 5(a), the degree distributions of the
student networks within the boards are rather homogeneous. They
exhibit a peak followed by an exponential tail, which overall fits
well to the Gamma distribution. Here, the degree $k$ must be
specified in further detail. Consider a case where two students A
and B on a given board do not communicate directly with each other
on that board but do communicate on a different board. In this case,
the two students are regarded as connected for the definition of
degree in Fig. 5(a). When such a pair is instead regarded as
disconnected, the degree is redefined and its distribution exhibits
fat tails, as shown in Fig. 5(b); this was also observed in another
BBS system.

The size of a board, which denotes the number of students posting
messages on it, has a broad distribution [Fig. 5(c)]: a power law
followed by a rapidly decaying tail. The edge density $A$ inside a
given board scales with its size $M$ as $A(M) \sim M^{-0.65}$, as
shown in Fig. 5(d). Such behavior cannot be observed in the random
sampling of populations of different sizes, thereby indicating that
the communications between students are indeed strongly constrained
within each board rather than across them. Further, the power-law
scaling behavior suggests that the BBS network is organized in a
self-similar manner. From this result, it is evident that the usual
projection method, which creates cliques from bipartite affiliation
graphs, cannot provide an appropriate description of the BBS system.
Moreover, such a size-dependent scaling of edge density within
groups has not been realized thus far in a simple model of a
clustered network [28].
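
The edge density of a board can be read as the fraction of member pairs that are actually linked. The sketch below assumes this definition, namely the number of distinct linked pairs divided by $M(M-1)/2$; the text does not spell out the normalization, and the function name and input layout are illustrative.

```python
def link_density(members, edges):
    """Fraction of the M(M-1)/2 possible member pairs that are linked.

    members: set of students posting on the board.
    edges: iterable of (u, v) pairs that exchanged dialogues; duplicates,
    reversed pairs, self-pairs, and pairs involving non-members are ignored.
    """
    m = len(members)
    if m < 2:
        return 0.0
    linked = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in members and v in members
    }
    return len(linked) / (m * (m - 1) / 2)


# Toy board with four members and two distinct linked pairs.
members = {"a", "b", "c", "d"}
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("a", "x")]
density = link_density(members, edges)
```

Under this definition the mean within-board degree is $A(M)(M-1)$, so a density decaying as $M^{-0.65}$ corresponds to a mean degree growing only as roughly $M^{0.35}$, far slower than the linear growth that a clique projection would imply.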



VI. EVOLUTION OF THE BBS NETWORK 

The daily record of the BBS network also allows us 
to examine the temporal evolution of the network. The 
number of vertices (students) N grows exponentially af- 
ter the transient period; however, the continuously mod- 
erated growth rate appears to attain a steady state 
[Fig. 6(a)]. Similar behavior is observed in the case of 
the number of links L and the number of dialogues W. 
The number of boards G grows at a rather steady rate 
over the period. 

Despite its continuous evolution, the structural prop- 
erties of the network seem to be in a stationary state. 
In other words, the overall network characteristics such 
as the degree distribution and clustering function achieve 
their forms in the initial period (after ~1 year), and do 
not change considerably with time, as shown in Figs. 6(c) 
and (d). The crossover time scale of approximately 1 
year can also be observed in terms of the evolution of 
the number of vertices N: Their growth patterns change 
qualitatively after ~10 months, as seen in Figs. 6(a) and 
(b). 


VII. SIMPLE MODEL

Having identified the main statistical characteristics of the BBS
network, we attempt to understand them from the perspective of a
simple network model. First, we consider a simple extension of the
model of a clustered network introduced by Newman [28]. The original
model of Newman is specified by two fundamental probability
distributions, $r_m$ and $s_M$: $r_m$ is the probability that an
individual belongs to $m$ groups [$P_b(B)$ in our notation; see
Fig. 2(f)], and $s_M$ is the probability that the group size is $M$
[$P_m(M)$ in our notation]. By assuming that the link density within
the groups is given by a constant parameter $p$, it is possible to
obtain several formulae for the network structure using the
generating function method. For example, the degree distribution of
the network can be written as follows:

$$P_d(k) = \frac{1}{k!} \frac{d^k}{dz^k} f_0[g_1(pz + q)] \Big|_{z=0}, \qquad (1)$$

where $f_0(z)$ and $g_1(z)$ are appropriate generating functions
defined as $f_0(z) = \sum_{m=0}^{\infty} r_m z^m$ and
$g_1(z) = \langle M \rangle^{-1} \sum_{M=0}^{\infty} M s_M z^{M-1}$,
with $\langle M \rangle = \sum_M M s_M$ the mean group size, and
$q = 1 - p$.

However, an obvious shortcoming of the model is that in real data,
the link densities are not uniform across the boards and they
strongly depend on the board size, as shown in Fig. 5(d). In fact,
by simply applying this model with the average link density
$p \approx 0.3$ along with $r_m$ and $s_M$ directly measured from the
data, the degree distribution of the BBS network cannot be
reproduced. Therefore, we modify the model by allowing $p$ to vary
across the groups, based on the empirical formula
$A(M) \sim M^{-0.65}$. Such a modification complicates the
mathematical formulae, and they must be solved numerically. The
resulting degree distribution of the modified model along with that
of the real data is shown in Fig. 7. Although it is imperfect, the
agreement is improved significantly. Thus, it is crucial to
incorporate the nonuniform link density into the realistic modeling
of the BBS network.

The manner in which the group size distribution, group membership
distribution, and group density scaling, which are the input
parameters of the model, achieve their present forms, as shown in
Figs. 5(c) and (d), is a topic for future study.
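
Equation (1) states that $P_d(k)$ is the coefficient of $z^k$ in the composed generating function, so for finite distributions it can be evaluated by plain polynomial arithmetic. The sketch below does this. Making the density depend on group size by replacing $(pz + q)^{M-1}$ with $(p_M z + q_M)^{M-1}$ inside $g_1$ is our assumption about how the modification enters, not a formula given in the text, and all function names are illustrative.

```python
import math


def poly_add(a, b):
    """Add two polynomials given as coefficient lists (lowest order first)."""
    n = max(len(a), len(b))
    return [
        (a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
        for i in range(n)
    ]


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def degree_distribution(r, s, density):
    """Return [P_d(0), P_d(1), ...] as the coefficients of f0(g1).

    r[m]: probability that an individual belongs to m groups.
    s[M]: probability that a group has size M.
    density(M): link density inside a group of size M (a constant p in the
    original model; size dependent in the assumed modification).
    """
    mean_size = sum(M * sM for M, sM in enumerate(s))
    g1 = [0.0]
    for M, sM in enumerate(s):
        if M == 0 or sM == 0:
            continue
        p = density(M)
        # Coefficients of (q + p z)^(M-1): links to the other M-1 members.
        binom = [
            math.comb(M - 1, j) * p ** j * (1 - p) ** (M - 1 - j)
            for j in range(M)
        ]
        g1 = poly_add(g1, [M * sM / mean_size * c for c in binom])
    g0, power = [0.0], [1.0]  # power holds g1^m for the current m
    for r_m in r:
        g0 = poly_add(g0, [r_m * c for c in power])
        power = poly_mul(power, g1)
    return g0


# Sanity check: everyone belongs to exactly one group of size 4 with p = 0.5,
# so the degree is binomially distributed over the 3 other members.
pd = degree_distribution([0.0, 1.0], [0.0, 0.0, 0.0, 0.0, 1.0], lambda M: 0.5)
```

Passing a size-dependent callable for `density` gives the modified variant; the prefactor of the $M^{-0.65}$ scaling is not stated in the text and would have to be taken from the data.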